Skip to content

[pull] master from DataDog:master#583

Merged
pull[bot] merged 2 commits into
ConnectionMaster:masterfrom
DataDog:master
Jun 5, 2026
Merged

[pull] master from DataDog:master#583
pull[bot] merged 2 commits into
ConnectionMaster:masterfrom
DataDog:master

Conversation

@pull

@pull pull Bot commented Jun 5, 2026

Copy link
Copy Markdown

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

sangeetashivaji and others added 2 commits June 4, 2026 20:54
* Add ClickHouse schema metrics (schema_metrics + view refresh)

Adds per-table size gauges via ClickhouseTableMetrics job class and
per-view refresh status/timing gauges collected inline in check().
Introduces schema_metrics config block and _VIEW_REFRESHES_QUERY
with graceful handling for ClickHouse < 24.3 and missing permissions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(clickhouse): rename changelog entry to match PR number 23900

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(clickhouse): sync SchemaMetrics model formatting with spec.yaml

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(clickhouse): gate view refresh metrics on schema_metrics.enabled

_collect_view_refresh_metrics was running on every check cycle when dbm
was enabled, regardless of schema_metrics.enabled (which defaults to false).
Gate it on self.table_metrics so it respects the opt-in config.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(clickhouse): avoid duplicate db: tag on schema metrics

Per-table and per-view schema metrics appended `db:<database>` on top of the
instance-level `db:<connection_db>` base tag, producing two conflicting `db:`
tags for any entity outside the connection database. Strip the base `db:` tag
before adding the entity's own database, mirroring the postgres
`tags_without_db` pattern, so each series carries exactly one `db:` tag.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Address review feedback: null guard, feature flag, constant naming, tests

- Remove redundant schema_metrics null check (always initialized with defaults)
- Register FeatureKey.SCHEMA_METRICS in config.py feature reporting
- Rename _DEFAULT_COLLECTION_INTERVAL → DEFAULT_COLLECTION_INTERVAL per AGENTS.md
- Add restart hint to permission-denied log message
- Add unit tests for _collect_view_refresh_metrics() and _handle_view_refreshes_error()

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Move view refresh logic into ClickhouseTableMetrics

_collect_view_refresh_metrics and _handle_view_refreshes_error now live
on ClickhouseTableMetrics and run inside run_job(), sharing the same
collection interval and DBMAsyncJob lifecycle as table size gauges.
State flags and query/status-map constants move to table_metrics.py.
clickhouse.py loses the duplicate if-table_metrics block, the two
module-level constants, the class attribute, and the three init flags.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Fix run_job() table-size tests silently exercising view refresh path

_patch_query now patches execute_query_raw to return [] alongside
_execute_query, so the view refresh half of run_job() succeeds cleanly
instead of swallowing a connection error. The cluster routing test also
patches execute_query_raw and gains an assertion that view refresh
queries go through clusterAllReplicas too.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Fix ruff formatting: collapse single-line format() call in table_metrics.py

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Restore trailing spaces in conf.yaml.example stripped by ruff

ddev validate config requires spec-generated trailing spaces on
'default: ' and 'For example: ' comment lines.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Fix view_refresh metrics to emit per-replica service checks

In single_endpoint_mode, clusterAllReplicas returns one row per replica per
view. The previous seen dedup keyed on (database, view) silently discarded
all but one replica's status, which could hide per-replica Error states.

Now keys seen on (database, view, host) and tags each series with host: so
each replica surfaces its own refresh status independently.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Use LIMIT 1 BY in table sizes query instead of Python-side dedup

clusterAllReplicas returns one row per replica per table; pushing the
dedup into SQL with LIMIT 1 BY database, name is cleaner than filtering
in Python.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Fix test_table_metrics for host tag and SQL-side dedup changes

- Add host field to _view_refresh_row fixture (query now returns hostName())
- Assert host: tag appears in view refresh service check
- Replace Python-side dedup test with assertion that LIMIT 1 BY appears in
  the table sizes query, since dedup is now the SQL's responsibility

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Use LIMIT 1 BY in view refreshes query instead of Python-side dedup

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Remove stale Python-side dedup test for view refreshes

Deduplication is now enforced by LIMIT 1 BY in the SQL query.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Fix ruff: remove extra blank line in test_table_metrics.py

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Remove view.refresh service check in favour of view.refresh.status gauge

Service checks are slated for deprecation; new ones should not be added.
The view.refresh.status gauge already covers the same OK/WARNING/CRITICAL/UNKNOWN
signal, so the service_check() call is redundant.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…ons (#23926)

* [SDBM-2635] postgres: include dbms=postgresql in SQL obfuscation options

Both statement_samples.py and statements.py construct an obfuscate_options dict
from ObfuscatorOptions.model_dump() and patch in key renames (table_names,
dollar_quoted_func, return_json_metadata). The dbms field was never included,
so the Go obfuscator received DBMS="" instead of "postgresql".

While no PostgreSQL-specific tokenizer branches exist today, explicitly setting
dbms ensures parity with dbm-logs-processor and protects against future
PostgreSQL-specific lexer paths being added in go-sqllexer.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* [SDBM-2635] postgres: add changelog entry for dbms obfuscation option fix

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* [SDBM-2635] postgres: test that obfuscate_sql receives dbms=postgresql option

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* [SDBM-2635] postgres: strengthen obfuscation options test and fix changelog

- Assert directly on _obfuscate_options attributes of both classes to
  unconditionally cover both code paths regardless of DB query activity
- Also assert return_json_metadata=True alongside dbms=postgresql
- Fix changelog entry missing trailing period

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* [SDBM-2635] postgres: fix line length in test_obfuscate_sql_options

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
@pull pull Bot locked and limited conversation to collaborators Jun 5, 2026
@pull pull Bot added the ⤵️ pull label Jun 5, 2026
@pull pull Bot merged commit 9311b89 into ConnectionMaster:master Jun 5, 2026
@pull pull Bot had a problem deploying to typo-squatting-release June 5, 2026 07:08 Failure
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants